Direct Policy Search and Uncertain Policy Evaluation

Author

Jieyu Zhao

Abstract

Reinforcement learning based on direct search in policy space requires few assumptions about the environment. Hence it is applicable in certain situations where most traditional reinforcement learning algorithms based on dynamic programming are not, especially in partially observable, deterministic worlds. In realistic settings, however, reliable policy evaluations are complicated by numerous sources of uncertainty, such as stochasticity in policy and environment. Given a limited life-time, how much time should a direct policy searcher spend on policy evaluations to obtain reliable statistics? Despite the fundamental nature of this question it has not received much attention yet. Our efficient approach based on the success-story algorithm (SSA) is radical in the sense that it never stops evaluating any previous policy modification except those it undoes for lack of empirical evidence that they have contributed to lifelong reward accelerations. Here we identify SSA’s fundamental advantages over traditional direct policy search (such as stochastic hill-climbing) on problems involving several sources of stochasticity and uncertainty.
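
The abstract does not spell out how SSA decides which policy modifications to undo. As a rough illustration only, the sketch below assumes the usual formulation of the success-story criterion: at each check, the reward per unit time measured since every surviving modification must exceed the rate measured since the modification before it (and the first must exceed the lifetime average); otherwise the most recent modifications are undone until the chain holds. The names `Checkpoint`, `success_story_holds`, and `enforce_success_story` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Checkpoint:
    """Bookkeeping for one policy modification (hypothetical structure)."""
    time: float                # lifetime clock when the modification was made
    reward: float              # cumulative reward collected up to that moment
    undo: Callable[[], None]   # restores the policy to its pre-modification state


def success_story_holds(stack: List[Checkpoint], now: float, total_reward: float) -> bool:
    """Check that reward rates strictly increase along the stack.

    Assumes now > 0 and now > cp.time for every checkpoint on the stack.
    """
    prev_rate = total_reward / now  # average reward rate over the whole lifetime
    for cp in stack:                # oldest modification first
        rate = (total_reward - cp.reward) / (now - cp.time)
        if rate <= prev_rate:
            return False
        prev_rate = rate
    return True


def enforce_success_story(stack: List[Checkpoint], now: float, total_reward: float) -> int:
    """Undo the most recent modifications until the criterion holds.

    Returns the number of modifications that were undone.
    """
    undone = 0
    while stack and not success_story_holds(stack, now, total_reward):
        stack.pop().undo()
        undone += 1
    return undone
```

In this sketch a modification is never "finished" being evaluated: every later check re-tests it against all reward collected since it was made, which is one way to read the abstract's statement that SSA never stops evaluating previous policy modifications.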

Similar resources

Sequential Classification-Based Optimization for Direct Policy Search

Direct policy search, which employs optimization algorithms to search for policy parameters that maximize total reward, often results in high-quality policies in complex reinforcement learning problems. Classification-based optimization is a recently developed framework for derivative-free optimization, which has been shown to be effective and efficient for non-convex optimization...

The Host Iranian Economy and Foreign Direct Investment: A Comparative Analysis

In this paper fourteen selected developing economies, including Iran, are compared to evaluate the Iranian position in attracting foreign direct investment (FDI). The evaluation is based on economic performance, risk, liberalization policy, and FDI determinant indicators. The results show that the Iranian economy has a sound economic performance and its economic, financial, and political risks ...

Employment, Wages and Optimal Monetary Policy

We study optimal monetary policy when the empirical evidence leaves the policymaker uncertain whether the true data-generating process is given by a model with sticky wages or a model with search and matching frictions in the labor market. Unless the policymaker is almost certain about the search and matching model being the correct data-generating process, the policymaker chooses to stabilize ...

A Review of Evidence-Informed Policy Making in Sustainable Healthy Food and Nutrition Systems

Background and purpose: A safe food system provides the conditions for consumers to decide about and choose the food products. This systematic review describes the alternatives in order to achieve a healthy nutrition pattern in a food system that can be used to make changes in current policies. Materials and methods: An electronic literature search was done in Google Scholar, Web of Science, P...

Providing Value to New Health Technology: The Early Contribution of Entrepreneurs, Investors, and Regulatory Agencies

Background: New technologies constitute an important cost-driver in healthcare, but the dynamics that lead to their emergence remain poorly understood from a health policy standpoint. The goal of this paper is to clarify how entrepreneurs, investors, and regulatory agencies influence the value of emerging health technologies. Methods: Our 5-year qualitative research program examined the proces...

Publication date: 1998
